Selection of auxiliary variables in missing data problems: Not all auxiliary variables are created equal

Authors

  • Felix Thoemmes
  • Norman Rose
Abstract

The treatment of missing data in the social sciences has changed tremendously during the last decade. Modern missing data techniques such as multiple imputation and full-information maximum likelihood are used much more frequently. These methods assume that data are missing at random. One very common approach to increasing the likelihood that missing at random is achieved consists of including many covariates as so-called auxiliary variables. These variables are either selected based on data considerations or included in an inclusive fashion, i.e., by taking all available auxiliary variables. However, neither approach accounts for the fact that, under a wide range of circumstances, there is a class of variables that, when used as auxiliary variables, will always increase bias in the estimation of parameters from data with missing values. In this paper we show that this bias exists, quantify it in a simulation study, and discuss ways to avoid selecting bias-inducing covariates as auxiliary variables.
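The abstract does not say which class of variables induces bias. As a rough illustration only, the sketch below assumes one mechanism that can produce the effect: an auxiliary variable that is a common consequence (a collider) of an unobserved cause of missingness and an unobserved cause of the outcome. All names (`u1`, `u2`, `aux`, `simulate`) are hypothetical, and single regression imputation stands in for multiple imputation or full-information maximum likelihood. This is not the authors' simulation design.

```python
import random


def simulate(n=100_000, seed=0):
    """Estimate the mean of y (true value 0) with and without a collider auxiliary."""
    rng = random.Random(seed)
    ys, auxs, observed = [], [], []
    for _ in range(n):
        u1 = rng.gauss(0, 1)           # unobserved cause of missingness (assumed)
        u2 = rng.gauss(0, 1)           # unobserved cause of the outcome (assumed)
        y = u2 + rng.gauss(0, 1)       # outcome; independent of u1
        aux = u1 + u2 + rng.gauss(0, 1)  # auxiliary variable caused by both u1 and u2
        ys.append(y)
        auxs.append(aux)
        observed.append(u1 > 0)        # y is missing whenever u1 <= 0

    # Complete-case estimate: uses only the observed y values, ignores aux.
    y_obs = [y for y, r in zip(ys, observed) if r]
    a_obs = [a for a, r in zip(auxs, observed) if r]
    cc_mean = sum(y_obs) / len(y_obs)

    # Regression imputation: fit y ~ aux on observed cases by least squares,
    # then fill each missing y with its predicted value.
    a_bar = sum(a_obs) / len(a_obs)
    sxy = sum((a - a_bar) * (y - cc_mean) for a, y in zip(a_obs, y_obs))
    sxx = sum((a - a_bar) ** 2 for a in a_obs)
    slope = sxy / sxx
    intercept = cc_mean - slope * a_bar
    filled = [y if r else intercept + slope * a
              for y, a, r in zip(ys, auxs, observed)]
    imputed_mean = sum(filled) / n
    return cc_mean, imputed_mean


if __name__ == "__main__":
    cc_mean, imputed_mean = simulate()
    print(f"complete-case mean:       {cc_mean:.3f}")
    print(f"mean after imputing on aux: {imputed_mean:.3f}")
```

In this setup missingness is unrelated to the outcome, so the complete-case mean should land near the true value of zero, while imputing from the auxiliary variable pulls the estimate clearly below zero. The auxiliary variable here would look attractive under a purely data-driven selection rule, since it correlates with both the outcome and the missingness indicator.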


Similar resources

A Cautious Note on Auxiliary Variables That Can Increase Bias in Missing Data Problems.

The treatment of missing data in the social sciences has changed tremendously during the last decade. Modern missing data techniques such as multiple imputation and full-information maximum likelihood are used much more frequently. These methods assume that data are missing at random. One very common approach to increase the likelihood that missing at random is achieved consists of including ma...


The Impact of Measurement Error in Auxiliary Variables on Model-Based Estimation of Finite Population Totals: A Simulation Study

Model-based prediction theory for finite population sampling and inference (Valliant et al., 2000) largely assumes that auxiliary variables are available for all units in the target population. These auxiliary variables play many important roles in prediction theory: they are used to 1) select samples balanced on the auxiliary variables that, when combined with (theoretically) appropriate predi...


Auxiliary variables in multiple imputation in regression with missing X: a warning against including too many in small sample research

BACKGROUND Multiple imputation is becoming increasingly popular. Theoretical considerations as well as simulation studies have shown that the inclusion of auxiliary variables is generally of benefit. METHODS A simulation study of a linear regression with a response Y and two predictors X1 and X2 was performed on data with n = 50, 100 and 200 using complete cases or multiple imputation with 0,...


Variable Selection for Regression Estimation in the Presence of Nonresponse

Nascimento Silva and Skinner (1997) (hereafter NS) consider the selection of auxiliary variables in the regression estimation of finite population means under simple random sampling. They consider the classical objective of regression estimation, which is to improve precision compared to the sample mean, and note that the variance of the regression estimator is not necessarily minimised by incl...


Second-order asymptotic theory for calibration estimators in sampling and missing-data problems

Consider three different but related problems with auxiliary information: infinite population sampling or Monte Carlo with control variates, missing response with explanatory variables, and Poisson and rejective sampling with auxiliary variables. We demonstrate unified regression and likelihood estimators and study their second-order properties. The likelihood estimators are second-order unbias...




Publication date: 2013